Fair Algorithms for Hierarchical Agglomerative Clustering
Hierarchical Agglomerative Clustering (HAC) algorithms are extensively
utilized in modern data science, and seek to partition the dataset into
clusters while generating a hierarchical relationship between the data samples.
HAC algorithms are employed in many applications, such as biology, natural
language processing, and recommender systems. Thus, it is imperative to ensure
that these algorithms are fair -- even if the dataset contains biases against
certain protected groups, the cluster outputs generated should not discriminate
against samples from any of these groups. However, recent work in clustering
fairness has mostly focused on center-based clustering algorithms, such as
k-median and k-means clustering. In this paper, we propose fair algorithms for
performing HAC that 1) enforce fairness constraints irrespective of the
distance linkage criterion used, 2) generalize to any natural measure of
clustering fairness for HAC, 3) work for multiple protected groups, and 4) have
running times competitive with vanilla HAC. Through extensive experiments on
multiple real-world UCI datasets, we show that our proposed algorithms find
fairer clusterings than both vanilla HAC and other state-of-the-art fair
clustering approaches.
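
To make the merge-time constraint concrete, here is a minimal sketch of
average-linkage HAC that prefers merges keeping every merged cluster's
protected-group balance above a threshold. This is an illustrative assumption,
not the paper's algorithm: the fairness measure (minimum group fraction), the
`min_balance` parameter, and the `fair_hac` name are all hypothetical.

```python
import numpy as np

def balance(cluster, groups, n_groups):
    """Fraction of the least-represented protected group in a cluster."""
    counts = np.bincount(groups[cluster], minlength=n_groups)
    return counts.min() / len(cluster)

def fair_hac(X, groups, min_balance=0.2):
    """Average-linkage HAC preferring merges whose result keeps group
    balance >= min_balance (a sketch, not the paper's method)."""
    n_groups = int(groups.max()) + 1
    clusters = [[i] for i in range(len(X))]
    merges = []
    while len(clusters) > 1:
        best = fallback = None
        best_d = fallback_d = np.inf
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average linkage: mean pairwise distance between clusters.
                d = np.mean([np.linalg.norm(X[a] - X[b])
                             for a in clusters[i] for b in clusters[j]])
                if d < fallback_d:
                    fallback, fallback_d = (i, j), d
                merged = clusters[i] + clusters[j]
                if balance(merged, groups, n_groups) >= min_balance and d < best_d:
                    best, best_d = (i, j), d
        i, j = best if best is not None else fallback  # no fair merge exists
        merges.append((tuple(clusters[i]), tuple(clusters[j])))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges

# Toy usage: 8 points in 2-D, alternating between two protected groups.
rng = np.random.default_rng(0)
X = rng.random((8, 2))
groups = np.array([0, 1] * 4)
for a, b in fair_hac(X, groups)[:3]:
    print("merge", a, "+", b)
```

The fallback simply takes the unconstrained closest pair so the hierarchy
always completes; how unsatisfiable fairness constraints are resolved is a
design choice this sketch does not take from the paper.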
DynaQuant: Compressing Deep Learning Training Checkpoints via Dynamic Quantization
With the increase in the scale of Deep Learning (DL) training workloads in
terms of compute resources and time consumption, the likelihood of encountering
in-training failures rises substantially, leading to lost work and resource
wastage. Such failures are typically offset by a checkpointing mechanism, which
comes at the cost of storage and network bandwidth overhead. State-of-the-art
approaches involve lossy model compression mechanisms, which induce a tradeoff
between the resulting model quality (accuracy) and compression ratio. Delta
compression is then used to further reduce the overhead by storing only the
difference between consecutive checkpoints. We make a key enabling observation
that the sensitivity of model weights to compression varies during training,
and different weights benefit from different quantization levels (ranging from
retaining full precision to pruning). We propose (1) a non-uniform quantization
scheme that leverages this variation, (2) an efficient search mechanism that
dynamically finds the best quantization configurations, and (3) a
quantization-aware delta compression mechanism that rearranges weights to
minimize checkpoint differences, thereby maximizing compression. We instantiate
these contributions in DynaQuant, a framework for DL workload checkpoint
compression. Our experiments show that DynaQuant consistently achieves a
better tradeoff between accuracy and compression ratio than prior work,
enabling compression ratios of up to 39x and withstanding up to 10 restores
with negligible accuracy impact for fault-tolerant training. DynaQuant
achieves at least an order-of-magnitude reduction in checkpoint storage
overhead for both training failure recovery and transfer learning use cases,
without any loss of accuracy.
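
The interaction between quantization granularity and delta size can be
illustrated with a minimal sketch: coarser codes absorb small weight updates,
so consecutive checkpoints share more codes and the sparse delta shrinks.
Everything below is assumed for illustration (uniform per-layer quantization,
the `bit_config` widths, a synthetic 5% weight perturbation); DynaQuant's
actual non-uniform scheme, configuration search, and weight rearrangement are
not reproduced here.

```python
import numpy as np

def quantize(w, lo, scale):
    """Map floats onto a shared uniform integer grid (one grid per layer,
    reused across checkpoints so deltas compare like with like)."""
    return np.round((w - lo) / scale).astype(np.uint16)

def sparse_delta(prev_codes, codes):
    """Record only the positions whose quantized code changed since the
    previous checkpoint; unchanged weights cost nothing to store."""
    idx = np.flatnonzero(prev_codes != codes)
    return idx, codes[idx]

# Hypothetical per-layer bit widths, standing in for the outcome of a
# sensitivity search: sensitive layers keep more bits than robust ones.
bit_config = {"embedding": 8, "dense": 4}

rng = np.random.default_rng(0)
w_t0 = {name: rng.standard_normal(1000) for name in bit_config}
# Simulate one training interval: nudge roughly 5% of the weights.
w_t1 = {k: v + (rng.random(v.shape) < 0.05) * 0.01 for k, v in w_t0.items()}

for name, bits in bit_config.items():
    lo, hi = float(w_t0[name].min()), float(w_t0[name].max())
    scale = (hi - lo) / ((1 << bits) - 1)
    c0 = quantize(w_t0[name], lo, scale)
    c1 = quantize(np.clip(w_t1[name], lo, hi), lo, scale)
    idx, _ = sparse_delta(c0, c1)
    print(f"{name}: {idx.size}/{c1.size} codes changed at {bits} bits")
```

The shared-grid assumption sidesteps the real difficulty: per the abstract,
quantization configurations change dynamically during training, which is why
DynaQuant's quantization-aware delta mechanism rearranges weights to minimize
checkpoint differences before compressing.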